Semi-supervised Collaborative Text Classification
نویسندگان
چکیده
Most text categorization methods require text content of documents that is often difficult to obtain. We consider “Collaborative Text Categorization”, where each document is represented by the feedback from a large number of users. Our study focuses on the semisupervised case in which one key challenge is that a significant number of users have not rated any labeled document. To address this problem, we examine several semi-supervised learning methods and our empirical study shows that collaborative text categorization is more effective than content-based text categorization and the manifold regularization is more effective than other state-of-the-art semi-supervised learning methods.
منابع مشابه
Combining Unigrams and Bigrams in Semi-Supervised Text Classification
Unlabeled documents vastly outnumber labeled documents in text classification. For this reason, semi-supervised learning is well suited to the task. Representing text as a combination of unigrams and bigrams has not shown consistent improvements compared to using unigrams in supervised text classification. Therefore, a natural question is whether this finding extends to semi-supervised learning...
متن کاملA Novel Multi label Text Classification Model using Semi supervised learning
Automatic text categorization (ATC) is a prominent research area within Information retrieval. Through this paper a classification model for ATC in multi-label domain is discussed. We are proposing a new multi label text classification model for assigning more relevant set of categories to every input text document. Our model is greatly influenced by graph based framework and Semi supervised le...
متن کاملTowards Multi Label Text Classification through Label Propagation
Classifying text data has been an active area of research for a long time. Text document is multifaceted object and often inherently ambiguous by nature. Multi-label learning deals with such ambiguous object. Classification of such ambiguous text objects often makes task of classifier difficult while assigning relevant classes to input document. Traditional single label and multi class text cla...
متن کاملSemi-supervised Convolutional Neural Networks for Text Categorization via Region Embedding
This paper presents a new semi-supervised framework with convolutional neural networks (CNNs) for text categorization. Unlike the previous approaches that rely on word embeddings, our method learns embeddings of small text regions from unlabeled data for integration into a supervised CNN. The proposed scheme for embedding learning is based on the idea of two-view semi-supervised learning, which...
متن کاملImproved Optimized Sentiment Classification On Dynamic Tweets
Real time Sentiment analysis is a subfield of Natural Language Processing concerned with the determination of opinion and subjectivity in a text, which has many applications. In this paper, classifiers for sentiment analysis of user opinion towards through comments and tweets sing Support Vector Machine (SVM) is described. The goal is to develop a classifier that performs sentiment analysis, by...
متن کامل